Execution of Compound Multi-Kernel OpenCL Computations in Multi-CPU/Multi-GPU Environments
Abstract
Current computational systems are heterogeneous by nature, combining CPUs and GPUs. As GPUs become an established platform for high-performance computing, the focus is shifting towards the seamless programming of these hybrid systems as a whole. The distinct architectural and execution models involved raise several challenges, because the best hardware configuration depends on both the behaviour of the computation and its workload. In this paper, we address the execution of compound, multi-kernel OpenCL computations in multi-CPU/multi-GPU environments. We discuss how these computations may be scheduled efficiently onto the target hardware, and how the system may adapt to changes in the workload to be processed and to fluctuations in the CPU's load. An experimental evaluation attests to the performance gains obtained by the combined use of CPU and GPU devices, compared with GPU-only executions, and by the use of data-locality optimizations in CPU environments.
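As an illustration of the kind of adaptive CPU/GPU work distribution the abstract refers to, here is a minimal sketch. It is not the paper's scheduling algorithm, which the abstract does not describe. It assumes a simple proportional policy in which work items are divided according to measured throughput, and the function names split_work and rebalance are hypothetical.

```python
def split_work(total_items, cpu_rate, gpu_rate):
    """Divide total_items between CPU and GPU in proportion to their
    throughput (items per second). Hypothetical policy, for illustration."""
    gpu_share = gpu_rate / (cpu_rate + gpu_rate)
    gpu_items = round(total_items * gpu_share)
    return total_items - gpu_items, gpu_items


def rebalance(cpu_items, gpu_items, cpu_time, gpu_time):
    """Re-estimate each device's throughput from the timings of the last
    run and compute a new split. Assumes both timings are positive."""
    total = cpu_items + gpu_items
    cpu_rate = cpu_items / cpu_time
    gpu_rate = gpu_items / gpu_time
    return split_work(total, cpu_rate, gpu_rate)


# Initial guess: the GPU is assumed to be three times faster than the CPU.
cpu_items, gpu_items = split_work(1000, cpu_rate=1.0, gpu_rate=3.0)

# The CPU turned out slower than expected (for example, due to other load),
# so the next split moves work towards the GPU.
cpu_items, gpu_items = rebalance(cpu_items, gpu_items, cpu_time=2.0, gpu_time=1.0)
```

Under this assumed policy, a rise in CPU load shows up as a longer CPU time in the previous run, which lowers the CPU's estimated throughput and shrinks its share of the next run.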
Related resources
Performance Portability in Accelerated Parallel Kernels
Heterogeneous architectures, by definition, include multiple processing components with very different microarchitectures and execution models. In particular, computing platforms from supercomputers to smartphones can now incorporate both CPU and GPU processors. Disparities between CPU and GPU processor architectures have naturally led to distinct programming models and development patterns for...
Towards a Tunable Multi-Backend Skeleton Programming Framework for Multi-GPU Systems
SkePU is a C++ template library that provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU systems. Currently available skeletons in SkePU in...
Multi-Stage Programming for GPUs in Modern C++ using PACXX
Writing and optimizing programs for high performance on systems with GPUs remains a challenging task even for expert programmers. One promising optimization technique is to evaluate parts of the program upfront on the CPU and embed the computed results in the GPU code allowing for more aggressive compiler optimizations. This technique is known as multi-stage programming and has proven to allow ...
Multicore/Multi-GPU Accelerated Simulations of Multiphase Compressible Flows Using Wavelet Adapted Grids
We present a computational method of coupling average interpolating wavelets with high-order finite volume schemes and its implementation on heterogeneous computer architectures for the simulation of multiphase compressible flows. The method is implemented to take advantage of the parallel computing capabilities of emerging heterogeneous multicore/multi-GPU architectures. A highly efficient par...
Object Identification in Binary Tomographic Images Using GPGPUs
The authors present a hybrid OpenCL CPU/GPU algorithm for identification of connected structures inside black and white 3D scientific data. This algorithm exploits parallelism both at CPU and GPGPU levels, but the work is predominantly done in GPUs. The underlying context of this work is the structural characterization of composite materials via tomography. The algorithm allows us to later infe...
Journal title:
- Concurrency and Computation: Practice and Experience
Volume 28, issue
Pages -
Publication date: 2016